
Scaling Down Semantic Leakage: Investigating Associative Bias in Smaller Language Models

Smilga, Veronika

arXiv.org Artificial Intelligence

Semantic leakage is a phenomenon recently introduced by Gonen et al. (2024). It refers to a situation in which associations learnt from the training data emerge in language model generations in an unexpected and sometimes undesired way. Prior work has focused on leakage in large language models (7B+ parameters). In this study, I use the Qwen2.5 model family to explore whether smaller models, ranging from 500M to 7B parameters, demonstrate less semantic leakage due to their limited capacity for capturing complex associations. Building on the dataset from Gonen et al. (2024), I introduce a new dataset of color-focused prompts, categorized into specific types of semantic associations, to systematically evaluate the models' performance. Results indicate that smaller models exhibit less semantic leakage overall, although this trend is not strictly linear, with medium-sized models sometimes surpassing larger ones in leaking behavior. The dataset, the model generations, and the evaluation code are publicly available at https://github.com/smilni/semantic_leakage_project.
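
A minimal sketch of how such a color-leakage check could look with the Hugging Face transformers library and a small Qwen2.5 checkpoint. The prompt/color pairs and the simple substring-based scoring below are illustrative stand-ins; the paper's actual dataset and the leak-rate metric of Gonen et al. (2024) differ in detail.

# Simplified sketch of a leak-rate check for color-focused prompts.
# The prompts and the scoring rule are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # smallest model in the family
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

# Hypothetical cue/color pairs: the cue word is semantically tied to a color.
prompts = [
    ("He loves eating lemons. His favorite color is", "yellow"),
    ("She works as a firefighter. Her favorite color is", "red"),
]

leaks = 0
for prompt, associated_color in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=10, do_sample=False)
    completion = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    # Count a "leak" when the cue-associated color shows up in the completion.
    if associated_color in completion.lower():
        leaks += 1

print(f"leak rate: {leaks / len(prompts):.2f}")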


ClassContrast: Bridging the Spatial and Contextual Gaps for Node Representations

Uddin, Md Joshem, Tola, Astrit, Sikand, Varin, Akcora, Cuneyt Gurcan, Coskunuzer, Baris

arXiv.org Machine Learning

Graph Neural Networks (GNNs) have revolutionized the domain of graph representation learning by utilizing neighborhood aggregation schemes in many popular architectures, such as message passing graph neural networks (MPGNNs). This scheme involves iteratively calculating a node's representation vector by aggregating and transforming the representation vectors of its adjacent nodes. Despite their success, MPGNNs face significant issues, such as oversquashing, oversmoothing, and underreaching, which hamper their effectiveness. Additionally, the reliance of MPGNNs on the homophily assumption, where edges typically connect nodes with similar labels and features, limits their performance in heterophilic contexts, where connected nodes often differ significantly. This necessitates the development of models that can operate effectively in both homophilic and heterophilic settings. In this paper, we propose a novel approach, ClassContrast, grounded in Energy Landscape Theory from Chemical Physics, to overcome these limitations. ClassContrast combines spatial and contextual information, leveraging a physics-inspired energy landscape to model node embeddings that are both discriminative and robust across homophilic and heterophilic settings. Our approach introduces contrast-based homophily matrices to enhance the understanding of class interactions and tendencies. Through extensive experiments, we demonstrate that ClassContrast outperforms traditional GNNs in node classification and link prediction tasks, proving its effectiveness and versatility in diverse real-world scenarios.
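
To make the idea of a class-level homophily matrix concrete, here is a generic sketch: entry [i, j] is the fraction of edges incident to class-i nodes whose other endpoint has class j, so the diagonal dominates in homophilic graphs and off-diagonal mass grows in heterophilic ones. This is a common way to summarize class interactions and only an assumption about the flavor of ClassContrast's contrast-based matrices, not the paper's exact construction.

# Generic class-interaction (homophily) matrix for intuition only;
# ClassContrast's contrast-based matrices may be defined differently.
import numpy as np

def class_interaction_matrix(edges, labels, num_classes):
    counts = np.zeros((num_classes, num_classes))
    for u, v in edges:
        counts[labels[u], labels[v]] += 1
        counts[labels[v], labels[u]] += 1  # treat the graph as undirected
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.clip(row_sums, 1, None)  # row-normalize per class

# Toy example: a 4-node path graph with two classes.
edges = [(0, 1), (1, 2), (2, 3)]
labels = [0, 0, 1, 1]
print(class_interaction_matrix(edges, labels, num_classes=2))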


OpportunityFinder: A Framework for Automated Causal Inference

Nguyen, Huy, Grover, Prince, Khatwani, Devashish

arXiv.org Artificial Intelligence

We introduce OpportunityFinder, a code-less framework for performing a variety of causal inference studies with panel data for non-expert users. In its current state, OpportunityFinder only requires users to provide raw observational data and a configuration file. A pipeline is then triggered that inspects and processes the data and chooses the suitable algorithm(s) to execute the causal study. It returns the causal impact of the treatment on the configured outcome, together with sensitivity and robustness results. Causal inference is widely studied and used to estimate the downstream impact of individuals' interactions with products and features. It is common for these causal studies to be performed periodically by scientists and/or economists. Business stakeholders are often bottlenecked on scientist or economist bandwidth to conduct causal studies. We offer OpportunityFinder as a solution for commonly performed causal studies with four key features: (1) easy to use for both Business Analysts and Scientists, (2) abstraction of multiple algorithms under a single I/O interface, (3) support for causal impact analysis under binary treatment with panel data, and (4) dynamic selection of algorithm based on the scale of the data.
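
As a purely hypothetical illustration of the configuration-plus-dispatch idea described above, the sketch below shows a config dict for binary-treatment panel data and a scale-based estimator choice. The field names, thresholds, and estimator labels are assumptions for illustration and are not OpportunityFinder's actual interface.

# Hypothetical, illustrative config and dispatch; not the real API.
config = {
    "data_path": "observations.csv",   # panel data: unit, period, outcome
    "treatment_column": "treated",     # binary treatment indicator
    "outcome_column": "weekly_spend",
    "unit_column": "customer_id",
    "time_column": "week",
}

def choose_estimator(num_units: int) -> str:
    # Dynamic selection based on data scale: a small panel gets a
    # synthetic-control-style estimator, a large panel a regression-based one.
    return "synthetic_control" if num_units < 1000 else "difference_in_differences"

print(choose_estimator(num_units=250))  # -> "synthetic_control"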


Representation Learning with Weighted Inner Product for Universal Approximation of General Similarities

Kim, Geewook, Okuno, Akifumi, Fukui, Kazuki, Shimodaira, Hidetoshi

arXiv.org Machine Learning

We propose $\textit{weighted inner product similarity}$ (WIPS) for neural-network-based graph embedding, where we optimize the weights of the inner product in addition to the parameters of the neural networks. Despite its simplicity, WIPS can approximate arbitrary general similarities, including positive definite, conditionally positive definite, and indefinite kernels. WIPS is free from similarity model selection, yet it can learn any similarity model, such as cosine similarity, negative Poincar\'e distance, and negative Wasserstein distance. Our extensive experiments show that the proposed method can learn high-quality distributed representations of nodes from real datasets, leading to an accurate approximation of similarities as well as high performance in inductive tasks.
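
A minimal PyTorch sketch of the core idea: the similarity between two embeddings is a per-dimension weighted inner product, with the weights learned jointly with the encoder and free to become negative, which is what allows indefinite similarities. The encoder architecture and dimensions here are placeholder assumptions, not the paper's exact setup.

# Minimal sketch of a weighted inner product similarity (WIPS) head.
import torch
import torch.nn as nn

class WIPS(nn.Module):
    def __init__(self, encoder: nn.Module, dim: int):
        super().__init__()
        self.encoder = encoder                        # maps node features to R^dim
        self.weights = nn.Parameter(torch.ones(dim))  # learnable, unconstrained (may go negative)

    def forward(self, x_i, x_j):
        z_i, z_j = self.encoder(x_i), self.encoder(x_j)
        # Weighted inner product: sum_k w_k * z_i[k] * z_j[k]
        return (self.weights * z_i * z_j).sum(dim=-1)

# Placeholder encoder over 16-dimensional node features.
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
model = WIPS(encoder, dim=8)
scores = model(torch.randn(4, 16), torch.randn(4, 16))  # similarity per pair
print(scores.shape)  # torch.Size([4])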